220 research outputs found

    Automatically Neutralizing Subjective Bias in Text

    Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view ("neutralizing" biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias. Comment: To appear at AAAI 202

    CLIMB: Curriculum Learning for Infant-inspired Model Building

    We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge. The challenge requires training a language model from scratch using only a relatively small training dataset of ten million words. We experiment with three variants of cognitively-motivated curriculum learning and analyze their effect on the performance of the model on linguistic evaluation tasks. In the vocabulary curriculum, we analyze methods for constraining the vocabulary in the early stages of training to simulate cognitively more plausible learning curves. In the data curriculum experiments, we vary the order of the training instances based on i) infant-inspired expectations and ii) the learning behavior of the model. In the objective curriculum, we explore different variations of combining the conventional masked language modeling task with a more coarse-grained word class prediction task to reinforce linguistic generalization capabilities. Our results did not yield consistent improvements over our own non-curriculum learning baseline across a range of linguistic benchmarks; however, we do find marginal gains on select tasks. Our analysis highlights key takeaways for specific combinations of tasks and settings which benefit from our proposed curricula. We moreover determine that careful selection of model architecture and training hyper-parameters yields substantial improvements over the default baselines provided by the BabyLM challenge.

    Human adaptation of Ebola virus during the West African outbreak

    The 2013–2016 outbreak of Ebola virus (EBOV) in West Africa was the largest recorded. It began following the cross-species transmission of EBOV from an animal reservoir, most likely bats, into humans, with phylogenetic analysis revealing the cocirculation of several viral lineages. We hypothesized that this prolonged human circulation led to genomic changes that increased viral transmissibility in humans. We generated a synthetic glycoprotein (GP) construct based on the earliest reported isolate and introduced amino acid substitutions that defined viral lineages. Mutant GPs were used to generate a panel of pseudoviruses, which were used to infect different human and bat cell lines. These data revealed that specific amino acid substitutions in the EBOV GP have increased tropism for human cells, while reducing tropism for bat cells. Such increased infectivity may have enhanced the ability of EBOV to transmit among humans and contributed to the wide geographic distribution of some viral lineages.

    Catching Element Formation In The Act

    Gamma-ray astronomy explores the most energetic photons in nature to address some of the most pressing puzzles in contemporary astrophysics. It encompasses a wide range of objects and phenomena: stars, supernovae, novae, neutron stars, stellar-mass black holes, nucleosynthesis, the interstellar medium, cosmic rays and relativistic-particle acceleration, and the evolution of galaxies. MeV gamma-rays provide a unique probe of nuclear processes in astronomy, directly measuring radioactive decay, nuclear de-excitation, and positron annihilation. The substantial information carried by gamma-ray photons allows us to see deeper into these objects, the bulk of the power is often emitted at gamma-ray energies, and radioactivity provides a natural physical clock that adds unique information. New science will be driven by time-domain population studies at gamma-ray energies. This science is enabled by next-generation gamma-ray instruments with one to two orders of magnitude better sensitivity, larger sky coverage, and faster cadence than all previous gamma-ray instruments. This transformative capability permits: (a) the accurate identification of the gamma-ray emitting objects and correlations with observations taken at other wavelengths and with other messengers; (b) construction of new gamma-ray maps of the Milky Way and other nearby galaxies where extended regions are distinguished from point sources; and (c) considerable serendipitous science of scarce events -- nearby neutron star mergers, for example. Advances in technology push the performance of new gamma-ray instruments to address a wide set of astrophysical questions. Comment: 14 pages including 3 figures

    Trypanosoma brucei Glycogen Synthase Kinase-3, A Target for Anti-Trypanosomal Drug Development: A Public-Private Partnership to Identify Novel Leads

    Over 60 million people in sub-Saharan Africa are at risk of infection with the parasite Trypanosoma brucei, which causes Human African Trypanosomiasis (HAT), also known as sleeping sickness. The disease results in systemic and neurological disability to its victims. At present, only four drugs are available for treatment of HAT. However, these drugs are expensive, limited in efficacy and severely toxic, hence the need to develop new therapies. Previously, TbruGSK-3 short has been validated as a potential target for developing new drugs against HAT. Because this enzyme has also been pursued as a drug target for other diseases, several inhibitors are available for screening against the parasite enzyme. Here we present the results of screening over 16,000 inhibitors of human GSK-3β (HsGSK-3β) from the Pfizer compound collection against TbruGSK-3 short. The resulting active compounds were tested for selectivity versus HsGSK-3β and a panel of human kinases, as well as their ability to inhibit proliferation of the parasite in vitro. We have identified attractive compounds that now form potential starting points for drug discovery against HAT. This is an example of how a tripartite partnership involving pharmaceutical industries, academic institutions and non-government organisations such as WHO TDR can stimulate research for neglected diseases.